Statistical Phrase-based Machine Translation: Experiments with Brazilian Portuguese

Authors

  • Wilker F. Aziz
  • Thiago A. S. Pardo
  • Ivandré Paraboni
Abstract

Statistical approaches have recently emerged as the main paradigm in Machine Translation (MT) research. In previous work we showed that a simple statistical word-based MT system may produce results highly comparable to those of a rule-based approach for closely related languages such as Brazilian Portuguese and European Spanish. In this work we take the discussion one step further and present evidence that a more sophisticated (namely, phrase-based) translation model may outperform rule-based translation for this language pair. We also report additional results of a first experiment in Portuguese/English phrase-based statistical MT.

Similar resources

Factored Translation between Brazilian Portuguese and English

Factored translation is an extension of state-of-the-art phrase-based statistical machine translation (PB-SMT). The main difference in the factored translation approach is that a word is not only a token (its surface form) but a vector composed of different information such as lemma, part-of-speech or morphological/syntactic tags. In this paper we present some experiments carried out to train and ...

Segmentation Strategies to Face Morphology Challenges in Brazilian-Portuguese/English Statistical Machine Translation and Its Integration in Cross-Language Information Retrieval

The use of morphology is particularly interesting in the context of statistical machine translation in order to reduce data sparseness and compensate for any lack of training data. In this work, we propose several approaches to introduce morphological knowledge into a standard phrase-based machine translation system. We provide word segmentation using two different tools (COGROO and MORFESSOR) which...

An Empirical Study of the Impact of Idioms on Phrase Based Statistical Machine Translation of English to Brazilian-Portuguese

This paper describes an experiment to evaluate the impact of idioms on the Statistical Machine Translation (SMT) process using the language pair English/Brazilian Portuguese. Our results show that on sentences containing idioms a standard SMT system achieves about half the BLEU score of the same system when applied to sentences that do not contain idioms. We also provide a short error analysis and o...

Fine-Tuning in Brazilian Portuguese-English Statistical Transfer Machine Translation: Verbal Tenses

This paper describes an experiment designed to evaluate the development of a Statistical Transfer-based Brazilian Portuguese to English Machine Translation system. We compare the performance of the system with the inclusion of new hand-written syntactic rules concerning verbal tense in Brazilian Portuguese and English. Results indicate that the system performance improved compared...

English-Portuguese Biomedical Translation Task Using a Genuine Phrase-Based Statistical Machine Translation Approach

We describe our approach to producing translations for the ACL-2016 Biomedical Translation Task on the English-Portuguese language pair, in both directions. We also present our own preliminary test results and the final results measured by the shared task organizers.



Publication date: 2009